Learning rich skills through temporal abstractions without supervision of external rewards is at the frontier of Reinforcement Learning research. Existing works mainly fall into two distinct categories: variational and Laplacian-based option discovery. The former maximizes the diversity of the discovered options through a mutual information loss but overlooks coverage of the state space, while the latter improves the coverage of options by increasing connectivity during exploration but does not consider diversity. In this paper, we propose a unified framework that quantifies diversity and coverage through a novel use of the Determinantal Point Process (DPP) and enables unsupervised option discovery that explicitly optimizes both objectives. Specifically, we define the DPP kernel matrix with the Laplacian spectrum of the state transition graph and use the expected mode number in the trajectories as the objective to capture and enhance both diversity and coverage of the learned options. The proposed option discovery algorithm is extensively evaluated on challenging tasks built with MuJoCo and Atari, demonstrating that it substantially outperforms state-of-the-art baselines from both the diversity-driven and coverage-driven categories. The code is available at https://github.com/LucasCJYSDL/ODPP.
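To make the DPP objective above more concrete, here is a minimal sketch. It assumes that the "expected mode number" can be read as the expected cardinality of a sample from an L-ensemble DPP, which for a kernel with eigenvalues λ_i is Σ λ_i / (1 + λ_i). The toy chain graph, the embedding size `k`, and all function names are illustrative assumptions; the abstract does not give the paper's actual kernel construction, so this is not the authors' implementation.

```python
import numpy as np

def chain_laplacian(n):
    """Unnormalized Laplacian L = D - A of an n-state chain (toy transition graph)."""
    A = np.zeros((n, n))
    idx = np.arange(n - 1)
    A[idx, idx + 1] = 1.0
    A[idx + 1, idx] = 1.0
    return np.diag(A.sum(axis=1)) - A

def spectral_features(lap, k):
    """Embed each state with the k eigenvectors of smallest nonzero eigenvalue."""
    _, vecs = np.linalg.eigh(lap)   # eigenvalues come back in ascending order
    return vecs[:, 1:k + 1]         # skip the constant eigenvector

def expected_cardinality(kernel):
    """E|Y| for an L-ensemble DPP with the given PSD kernel."""
    lam = np.clip(np.linalg.eigvalsh(kernel), 0.0, None)  # guard tiny negatives
    return float(np.sum(lam / (1.0 + lam)))

feats = spectral_features(chain_laplacian(10), k=3)
spread = feats[[0, 4, 9]]       # hypothetical landmark states spread along the chain
clustered = feats[[0, 1, 2]]    # hypothetical landmark states bunched together
e_spread = expected_cardinality(spread @ spread.T)
e_clustered = expected_cardinality(clustered @ clustered.T)
print(e_spread, e_clustered)
```

In this toy setup the spread-out set of states should receive the larger expected cardinality, which is the sense in which such a score can reward both diversity and coverage.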
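Deep learning algorithms for predicting from neuroimaging data have shown great promise in a variety of applications. Prior work has shown that deep learning models that exploit the 3D structure of the data can outperform standard machine learning on several learning tasks. However, most prior research in this area has focused on neuroimaging data from adults. Within the Adolescent Brain and Cognitive Development (ABCD) dataset, a large longitudinal developmental study, we examine structural MRI data to predict sex and to identify sex-related changes in brain structure. The results show that sex prediction accuracy is exceptionally high (> 97%) with more than 200 training epochs, and that this accuracy increases with age. The brain regions identified as most discriminative for the task include predominantly frontal areas and the temporal lobe. When evaluating the changes in sex prediction associated with a two-year increase in age, a broader set of visual, cingulate, and insular regions is revealed. Our findings show a pattern of sex-related structural change even over a small age range. This suggests that it may be possible to study how the brain changes during adolescence by examining how these changes relate to different behavioral and environmental factors.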
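Self-supervised learning (SSL), a newly emerging unsupervised representation learning paradigm, generally follows a two-stage learning pipeline: 1) learning invariant and discriminative representations with auto-annotated pretext tasks, followed by 2) transferring the representations to downstream tasks. The two stages are usually implemented separately, which makes the learned representations agnostic to the downstream tasks. Currently, most works are devoted to the first stage, whereas how to learn downstream tasks from limited labeled data using the already learned representations is less studied. In particular, selectively utilizing complementary representations from different pretext tasks for a downstream task is both crucial and challenging. In this paper, we propose a new solution that leverages an attention mechanism to adaptively select suitable representations for the task. Meanwhile, resorting to information theory, we theoretically prove that gathering representations from diverse pretext tasks is more effective than using a single one. Extensive experiments validate that our scheme significantly exceeds current pretext-matching based methods in gathering knowledge and alleviating negative transfer in downstream tasks.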
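Covering option discovery has been developed to improve the exploration of reinforcement learning in single-agent scenarios with sparse reward signals, by connecting the most distant states in the embedding space provided by the Fiedler vector of the state transition graph. However, these option discovery methods cannot be directly extended to multi-agent scenarios, since the joint state space grows exponentially with the number of agents in the system. Thus, existing research on adopting options in multi-agent scenarios still relies on single-agent option discovery and fails to directly discover joint options that can improve the connectivity of the agents' joint state space. In this paper, we show that it is indeed possible to directly compute multi-agent options with collaborative exploratory behaviors among the agents, while still enjoying the ease of decomposition. Our key idea is to approximate the joint state space as a Kronecker graph, that is, the Kronecker product of the individual agents' state transition graphs, based on which we can directly estimate the Fiedler vector of the joint state space using the Laplacian spectrum of the individual agents' transition graphs. This decomposition enables us to efficiently construct multi-agent joint options by encouraging agents to connect the sub-goal joint states that correspond to the minimum or maximum values of the estimated joint Fiedler vector. Evaluation on multi-agent collaborative tasks shows that the proposed algorithm can successfully identify multi-agent options and significantly outperforms prior works using single-agent options or no options, in terms of both faster exploration and higher cumulative rewards.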
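The decomposition described above can be sketched as follows, under the assumption that the normalized Laplacian is used (the abstract does not say which Laplacian the paper adopts). For symmetric adjacency matrices, the normalized adjacency of a Kronecker graph is the Kronecker product of the factors' normalized adjacencies, so its eigenpairs are products of the factors' eigenpairs. The two toy graphs and the function names are hypothetical; this illustrates the spectral identity, not the authors' algorithm.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D^{-1/2} A D^{-1/2} for a symmetric adjacency matrix A."""
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(axis=1))
    return A * np.outer(d_inv_sqrt, d_inv_sqrt)

def joint_fiedler_from_factors(A1, A2):
    """Fiedler value and vector of the Kronecker graph of A1 and A2, from the factors only."""
    w1, U1 = np.linalg.eigh(normalized_adjacency(A1))
    w2, U2 = np.linalg.eigh(normalized_adjacency(A2))
    products = np.outer(w1, w2)                     # eigenvalues of the joint normalized adjacency
    order = np.argsort(products, axis=None)[::-1]   # flat indices, descending by value
    i, j = np.unravel_index(order[1], products.shape)  # second-largest product
    # Normalized Laplacian is I - N, so its second-smallest eigenvalue is 1 - product.
    return 1.0 - products[i, j], np.kron(U1[:, i], U2[:, j])

# Toy single-agent transition graphs (assumed): a triangle with a tail, and a triangle.
A1 = np.array([[0, 1, 1, 0],
               [1, 0, 1, 0],
               [1, 1, 0, 1],
               [0, 0, 1, 0]], dtype=float)
A2 = np.array([[0, 1, 1],
               [1, 0, 1],
               [1, 1, 0]], dtype=float)

fiedler_value, fiedler_vec = joint_fiedler_from_factors(A1, A2)

# Candidate sub-goal joint states: extremes of the estimated joint Fiedler vector.
# np.kron orders joint states as (state of agent 1) * |S2| + (state of agent 2).
subgoal_min = divmod(int(np.argmin(fiedler_vec)), A2.shape[0])
subgoal_max = divmod(int(np.argmax(fiedler_vec)), A2.shape[0])
print(fiedler_value, subgoal_min, subgoal_max)
```

The point of the sketch is that only the small per-agent eigendecompositions are computed; the joint matrix, whose size is the product of the individual state-space sizes, is never formed.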
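We consider the problem of abnormality localization for clinical applications. While deep learning has driven much recent progress in medical imaging, many clinical challenges are not fully addressed, which limits its broader usage. Although recent methods report high diagnostic accuracy, physicians have concerns about trusting these algorithms' results for diagnostic decision-making because of a general lack of algorithmic decision reasoning and interpretability. One potential way to address this problem is to further train these models to localize abnormalities in addition to classifying them. However, doing this accurately requires a large amount of disease localization annotations from clinical experts, a task that is prohibitively expensive for most applications. In this work, we address these issues with a new attention-driven weakly supervised algorithm comprising a hierarchical attention mining framework that unifies activation-based and gradient-based visual attention in a holistic manner. Our key algorithmic innovations include the design of explicit ordinal attention constraints, which enable principled model training in a weakly supervised fashion while also facilitating the generation of visual-attention-driven model explanations by means of localization cues. On two large-scale chest X-ray datasets (NIH ChestX-ray14 and CheXpert), we demonstrate significant localization performance improvements over the current state of the art while also achieving competitive classification performance. Our code is available at https://github.com/oyxhust/ham.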
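Recent years have seen impressive progress on challenging multi-hop QA tasks. However, these QA models may fail when faced with disturbances in the input text, and their interpretability in conducting multi-hop reasoning remains uncertain. Previous adversarial attack works usually edit the whole question sentence, which has limited effect on testing entity-based multi-hop reasoning ability. In this paper, we propose an adversarial attack method based on multi-hop reasoning chains. We formulate the multi-hop reasoning chains, starting from the query entity and ending at the answer entity, in a constructed graph, which allows us to align the question with each reasoning hop and thus attack any hop. We categorize the questions into different reasoning types and adversarially modify the part of the question corresponding to the selected reasoning hop to generate a distracting sentence. We test our adversarial scheme on three QA models on the HotpotQA dataset. The results show a significant performance reduction on both answer prediction and supporting-fact prediction, verifying the effectiveness of our reasoning-chain-based attack method against multi-hop reasoning models as well as their vulnerability. Our adversarial re-training further improves the performance and robustness of these models.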
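We introduce a curriculum learning algorithm, Variational Automatic Curriculum Learning (VACL), for solving challenging goal-conditioned cooperative multi-agent reinforcement learning problems. We motivate our paradigm through a variational perspective, where the learning objective can be decomposed into two terms: task learning on the current task distribution, and a curriculum update to a new task distribution. Local optimization over the second term suggests that the curriculum should gradually expand the training tasks from easy to hard. Our VACL algorithm implements this variational paradigm with two practical components, task expansion and entity progression, which produce training curricula over both the task configurations and the number of entities in the task. Experimental results show that VACL solves a collection of sparse-reward problems with a large number of agents. In particular, using a single desktop machine, VACL achieves a 98% coverage rate with 100 agents in the simple-spread benchmark and reproduces the ramp-use behavior originally shown in OpenAI's hide-and-seek project. Our project website is at https://sites.google.com/view/vacl-neurips-2021.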
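This paper reviews the first NTIRE challenge on quality enhancement of compressed video, with a focus on the proposed methods and results. In this challenge, the new Large-scale Diverse Video (LDV) dataset is employed. The challenge has three tracks. Tracks 1 and 2 aim at enhancing videos compressed by HEVC at a fixed QP, while Track 3 is designed for enhancing videos compressed by x265 at a fixed bit-rate. In addition, the quality enhancement in Tracks 1 and 3 targets improving fidelity (PSNR), while Track 2 targets enhancing perceptual quality. The three tracks attracted 482 registrations in total. In the test phase, 12 teams, 8 teams, and 11 teams submitted final results for Tracks 1, 2, and 3, respectively. The proposed methods and solutions gauge the state of the art of video quality enhancement. The homepage of the challenge: https://github.com/renyang-home/ntire21_venh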
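For readers unfamiliar with the fidelity metric named above, the following is a minimal sketch of the standard PSNR definition, 10 · log10(MAX² / MSE). The 8-bit peak value of 255 and the function name are assumptions; the challenge's exact evaluation protocol (color space, per-frame averaging) is not stated in the abstract.

```python
import numpy as np

def psnr(reference, enhanced, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two frames of equal shape."""
    ref = np.asarray(reference, dtype=np.float64)
    enh = np.asarray(enhanced, dtype=np.float64)
    mse = np.mean((ref - enh) ** 2)
    if mse == 0:
        return float("inf")   # identical frames
    return float(10.0 * np.log10(max_value ** 2 / mse))

# Toy 4x4 "frames" (assumed data): every pixel is off by exactly one grey level.
raw_frame = np.zeros((4, 4))
enhanced_frame = np.ones((4, 4))
score = psnr(raw_frame, enhanced_frame)
print(score)
```

A higher value means the enhanced frame is closer to the uncompressed reference, which is why a fidelity track would rank submissions by it.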
In the era of the Internet of Things (IoT), network-wide anomaly detection is a crucial part of monitoring IoT networks due to the inherent security vulnerabilities of most IoT devices. Principal Component Analysis (PCA) has been proposed to separate network traffic into two disjoint subspaces corresponding to normal and malicious behaviors for anomaly detection. However, privacy concerns and the limited computing resources of devices compromise the practical effectiveness of PCA. We propose a federated PCA-based Grassmannian optimization framework that coordinates IoT devices to aggregate a joint profile of normal network behaviors for anomaly detection. First, we introduce a privacy-preserving federated PCA framework to simultaneously capture the traffic profiles of various IoT devices. Then, we investigate gradient-based learning with the alternating direction method of multipliers on the Grassmann manifold to guarantee fast training and low detection latency using limited computational resources. Empirical results on the NSL-KDD dataset demonstrate that our method outperforms baseline approaches. Finally, we show that the Grassmann manifold algorithm is well suited to IoT anomaly detection, as it permits a drastic reduction in the analysis time of the system. To the best of our knowledge, this is the first federated PCA algorithm for anomaly detection that meets the requirements of IoT networks.
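The subspace idea behind PCA-based anomaly detection can be illustrated with a small sketch. It uses plain centralized PCA via SVD, not the federated Grassmannian method proposed in the paper; the synthetic data, the subspace dimension `k`, the threshold rule, and the function names are all assumptions made for illustration.

```python
import numpy as np

def fit_normal_subspace(X, k):
    """Return the mean and top-k principal directions of normal traffic X (n x d)."""
    mean = X.mean(axis=0)
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k].T   # d x k orthonormal basis of the "normal" subspace

def residual_score(x, mean, basis):
    """Squared norm of the part of x that lies outside the normal subspace."""
    centered = x - mean
    residual = centered - basis @ (basis.T @ centered)
    return float(residual @ residual)

rng = np.random.default_rng(0)
# Synthetic "normal" traffic: 5-D features lying near a 2-D subspace, plus small noise.
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 5))
X_normal = latent @ mixing + 0.01 * rng.normal(size=(200, 5))

mean, basis = fit_normal_subspace(X_normal, k=2)
# Simple threshold (assumed rule): the largest residual seen on the training data.
threshold = max(residual_score(x, mean, basis) for x in X_normal)

# Synthetic anomaly: a normal point shifted along a direction orthogonal to the subspace.
off_direction = np.linalg.svd(mixing)[2][-1]
anomalous_point = rng.normal(size=2) @ mixing + 3.0 * off_direction
anomaly_score = residual_score(anomalous_point, mean, basis)
print(anomaly_score > threshold)
```

Points that follow the normal traffic pattern project almost entirely onto the learned subspace, so a large residual signals behavior the normal profile cannot explain.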
In recent years, the field of intelligent transportation systems (ITS) has achieved remarkable success, mainly due to the large amount of available annotated data. However, obtaining such annotated data is expensive in practice. Therefore, a more realistic strategy is to leverage semi-supervised learning (SSL) with a small amount of labeled data and a large amount of unlabeled data. Typically, semantic consistency regularization and two-stage learning methods that decouple feature extraction from classification have been proven effective. Nevertheless, representation learning limited to semantic consistency regularization may not guarantee the separation or discriminability of representations of samples with different semantics, and due to the inherent limitations of two-stage learning methods, the extracted features may not match the specific downstream tasks. To deal with these drawbacks, this paper proposes an end-to-end deep semi-supervised learning method based on a double contrast of semantics and features, which extracts effective task-specific discriminative features by contrasting the semantics and features of positive and negative augmented sample pairs. Moreover, we leverage information theory to explain the rationality of the double contrast of semantics and features, and relax mutual information to a contrastive loss in a simpler way. Finally, the effectiveness of our method is verified on benchmark datasets.